Robust Regression with Projection Based M-estimators

Authors

  • Haifeng Chen
  • Peter Meer
Abstract

The robust regression techniques in the RANSAC family are popular today in computer vision, but their performance depends on a user-supplied threshold. We eliminate this drawback of RANSAC by reformulating another robust method, the M-estimator, as a projection pursuit optimization problem. The projection based pbM-estimator automatically derives the threshold from univariate kernel density estimates. Nevertheless, the performance of the pbM-estimator equals or exceeds that of RANSAC techniques tuned to the optimal threshold, a value which is never available in practice. Experiments were performed with both synthetic and real data in the affine motion and fundamental matrix estimation tasks.

1. An Analysis of Robust Regression

Robust regression is the generic name of techniques which estimate a parametric regression model in the presence of a significant number of data points not belonging to that model, i.e., outliers. The 'secret' of robust estimation is the use of valid additional assumptions about the data. The scale of the data of interest, i.e., a measure of the noise corrupting the inliers (such as standard deviation or range), is the most frequently used additional assumption. The robust regression techniques most popular in computer vision, RANSAC [4] and its improved versions MSAC and MLESAC [15], [16], impose an upper bound on the scale, and the parameter estimates are found by maximizing the number of points (inliers) which can be placed within this bound. The additional assumption behind the least median of squares (LMedS) estimator [12] and similar techniques is equivalent. A lower bound is imposed on the required percentage of inliers in the data, and the parameter estimates are found by minimizing the scale of data subsets of this size. In real applications, however, there is often not enough a priori knowledge to reliably define this additional information.
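The dependence on a user-supplied scale bound can be illustrated with a minimal RANSAC-style line fit. This is an illustrative sketch of the generic RANSAC idea described above, not the implementation evaluated in the paper: candidates are generated from two-point elemental subsets and scored by the number of points falling within the threshold.

```python
import numpy as np

def ransac_line(points, threshold, n_trials=500, rng=None):
    """Minimal RANSAC line fit: draw 2-point elemental subsets,
    count inliers within the user-supplied threshold."""
    rng = np.random.default_rng(rng)
    best_inliers, best_params = 0, None
    n = len(points)
    for _ in range(n_trials):
        i, j = rng.choice(n, size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        # line through the two points as a*x + b*y + c = 0, unit normal
        a, b = y2 - y1, x1 - x2
        norm = np.hypot(a, b)
        if norm == 0:
            continue  # degenerate sample
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # residuals are orthogonal distances to the candidate line
        residuals = np.abs(points @ np.array([a, b]) + c)
        inliers = int((residuals < threshold).sum())
        if inliers > best_inliers:
            best_inliers, best_params = inliers, (a, b, c)
    return best_params, best_inliers
```

The score, and therefore the selected candidate, changes with `threshold`; a bound that is too tight rejects inliers, one that is too loose absorbs outliers, which is exactly the sensitivity the pbM-estimator removes.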
Embedding the robust estimator into a second optimization process over the range of possible bounds, e.g., [10], is not a general enough solution. Indeed, whenever the employed assumptions are not valid, the robust regression may yield erroneous results, which in turn can corrupt the comparison across different operating conditions. The technique described in this paper does not require the user to provide any scale estimate; instead it exploits an intrinsic relation between the optimization criterion and the data space.

Probabilistic sampling is the search technique of choice to minimize the optimization criterion of the robust regressions in the RANSAC family and LMedS. Elemental subsets, containing the smallest number of data points which uniquely define a model parameter candidate, are drawn without replacement from the data. The quality of a candidate is then assessed using all the data points, and the final estimate is found by comparison over a number of such candidates [12, p. 198]. This number, however, becomes unfeasibly large when the outliers dominate the data or the data is high dimensional. A possible solution is guided sampling, in which additional information about the probability of a data point being an inlier is integrated into the sampling process, e.g., [14]. It is very important to recognize that while probabilistic sampling is a computational tool with no relation to the optimization criterion, guided sampling is a robust procedure since it also exploits additional information. But reliable information to guide sampling cannot be guaranteed in many computer vision applications, and the robust regression method proposed in this paper shifts the emphasis from sampling in the input space to an efficient search in the space of the parameters. In Section 2, by reformulating the M-estimator optimization criterion, we introduce a new generic technique, the pbM-estimator, which does not require the user to provide a scale estimate.
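How quickly the number of required elemental subsets grows can be seen from the standard RANSAC trial-count formula N = ceil(log(1 − P) / log(1 − w^m)), where w is the inlier ratio, m the elemental subset size, and P the desired confidence of drawing at least one outlier-free subset. The sketch below uses this textbook formula, which is not specific to this paper:

```python
import math

def required_trials(inlier_ratio, subset_size, confidence=0.99):
    """Number of elemental subsets needed so that, with probability
    `confidence`, at least one drawn subset contains only inliers."""
    p_clean = inlier_ratio ** subset_size  # P(one subset is all inliers)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))
```

For line fitting (m = 2) with 50% inliers a handful of trials suffice, but for fundamental matrix estimation from 7-point subsets at the same inlier ratio hundreds are needed, and at 30% inliers with 8-point subsets the count reaches tens of thousands, the "unfeasibly large" regime noted above.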
A multidimensional direct search technique using the simplex method, described in Section 3, keeps the computational burden at a satisfactory level. In Section 4 the performance of the pbM-estimator is compared to techniques from the RANSAC family on synthetic data and in two vision applications: affine motion and fundamental matrix estimation.

2. M-estimate Computation Using Projection Pursuit

The principle behind the method discussed in this section was first proposed in [1], as part of a different approach limited to low-dimensional data. See also Section 5. Here we provide the implementation for arbitrary dimensional data, and introduce a new robust regression technique, the pbM-estimator.
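The core idea, deriving the scale from the data itself via a univariate kernel density estimate of the projected points, can be sketched as a projection index. The Epanechnikov kernel, the MAD-based normal-reference bandwidth, and the grid evaluation below are illustrative assumptions, not the authors' exact choices:

```python
import numpy as np

def pbm_projection_index(projections):
    """Sketch of a pbM-style projection index: the peak of a univariate
    kernel density estimate of the projected data. The bandwidth is
    derived from the data (MAD rule of thumb), so no user-supplied
    scale or threshold is needed."""
    x = np.asarray(projections, dtype=float)
    n = x.size
    # robust bandwidth: MAD rescaled to be consistent with a normal sigma
    mad = np.median(np.abs(x - np.median(x)))
    h = 1.06 * (mad / 0.6745) * n ** (-0.2)
    if h == 0:
        h = 1e-12
    grid = np.linspace(x.min(), x.max(), 512)
    # Epanechnikov kernel density estimate evaluated on the grid
    u = (grid[:, None] - x[None, :]) / h
    k = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)
    density = k.sum(axis=1) / (n * h)
    return density.max(), grid[np.argmax(density)]
```

Maximizing such an index over unit projection directions concentrates the inliers into the dominant density mode, with the kernel bandwidth playing the role that the user-supplied threshold plays in RANSAC.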


Similar Articles

Projection Estimators for Generalized Linear Models

We introduce a new class of robust estimators for generalized linear models which is an extension of the class of projection estimators for linear regression. These projection estimators are defined using an initial robust estimator for a generalized linear model with only one unknown parameter. We found a bound for the maximum asymptotic bias of the projection estimator caused by a fraction ε ...

Full text

A Robust Dispersion Control Chart Based on M-estimate

Process control charts are proven techniques for improving quality. Specifying the control limits is the most important step in designing a control chart. The presence of outliers may extremely affect the estimates of parameters using classical methods. Robust estimators which are not affected by outliers or the small departures from the model assumptions are applied in this paper to specify th...

Full text

Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets

Robust regression is an appropriate alternative to ordinary regression when outliers exist in a given data set. If the observations are fuzzy, ordinary regression methods cannot model them; in this case, fuzzy regression is a suitable method. When observations are fuzzy and there are outliers in the data set, robust fuzzy regression methods are appropriate alternatives....

Full text

Robust Estimators are Hard to Compute

In modern statistics, the robust estimation of parameters of a regression hyperplane is a central problem. Robustness means that the estimation is not or only slightly affected by outliers in the data. In this paper, it is shown that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, Constrained M estimator, Projection Depth (PD) and Stahel-Donoho. In addition, a...

Full text

A Two-Phase Robust Estimation of Process Dispersion Using M-estimator

Parameter estimation is the first step in constructing any control chart. Most estimators of mean and dispersion are sensitive to the presence of outliers. The data may be contaminated by outliers either locally or globally. The existing robust estimators deal only with global contamination. In this paper a robust estimator for dispersion is proposed to reduce the effect of local contamination ...

Full text

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Full text



Journal:

Volume   Issue

Pages  -

Publication date: 2003